74 research outputs found
Efficient learning in Approximate Bayesian Computation
Operator norm convergence of spectral clustering on level sets
Following Hartigan, a cluster is defined as a connected component of the
t-level set of the underlying density, i.e., the set of points for which the
density is greater than t. A clustering algorithm which combines a density
estimate with spectral clustering techniques is proposed. Our algorithm is
composed of two steps. First, a nonparametric density estimate is used to
extract the data points for which the estimated density takes a value greater
than t. Next, the extracted points are clustered based on the eigenvectors of a
graph Laplacian matrix. Under mild assumptions, we prove the almost sure
convergence in operator norm of the empirical graph Laplacian operator
associated with the algorithm. Furthermore, we characterize the typical behavior of the representation of the dataset in the feature space, which establishes the strong consistency of our proposed algorithm.
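The two-step procedure described above can be sketched in a few lines. This is a minimal illustration only: the KDE bandwidth, the level t, the RBF affinity, the synthetic two-blob data, and the use of scikit-learn are assumptions for the sketch, not the paper's exact construction.

```python
# Sketch of level-set spectral clustering: (1) keep points whose estimated
# density exceeds t, (2) spectrally cluster the retained points.
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
# Two well-separated Gaussian blobs plus uniform background noise.
X = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(200, 2)),
    rng.normal(loc=+3.0, scale=0.5, size=(200, 2)),
    rng.uniform(-6, 6, size=(50, 2)),
])

# Step 1: nonparametric density estimate; extract points with density > t.
kde = KernelDensity(bandwidth=0.5).fit(X)
density = np.exp(kde.score_samples(X))     # score_samples returns log-density
t = 0.01                                   # level chosen by the practitioner
high = X[density > t]

# Step 2: cluster the extracted points via graph-Laplacian eigenvectors.
labels = SpectralClustering(n_clusters=2, affinity="rbf",
                            gamma=1.0, random_state=0).fit_predict(high)
print(len(high), np.unique(labels))
```

With the level t above, essentially only the two blobs survive step 1, and step 2 separates them.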
The Normalized Graph Cut and Cheeger Constant: from Discrete to Continuous
Let M be a bounded domain of a Euclidean space with smooth boundary. We
relate the Cheeger constant of M and the conductance of a neighborhood graph
defined on a random sample from M. By restricting the minimization defining the
latter to a particular class of subsets, we obtain consistency (after
normalization) as the sample size increases, and show that any minimizing
sequence of subsets has a subsequence converging to a Cheeger set of M.
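For intuition, the conductance minimized in the discrete problem can be computed directly on a small neighborhood graph. The epsilon radius and the dumbbell-shaped sample below are illustrative assumptions, not the paper's setting.

```python
# Conductance of a subset S in an epsilon-neighborhood graph:
# weight of the cut divided by the smaller side's volume (sum of degrees).
import numpy as np

rng = np.random.default_rng(1)
# Sample from a "dumbbell": two unit squares separated by a gap of width 1.
left = rng.uniform([0, 0], [1, 1], size=(100, 2))
right = rng.uniform([2, 0], [3, 1], size=(100, 2))
X = np.vstack([left, right])

eps = 0.3
D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
W = ((D <= eps) & (D > 0)).astype(float)       # epsilon-neighborhood graph

def conductance(W, S):
    """Cut weight over min(vol(S), vol(complement of S))."""
    S = np.asarray(S, dtype=bool)
    cut = W[S][:, ~S].sum()
    return cut / min(W[S].sum(), W[~S].sum())

# The natural bisection (left square vs right square) has conductance 0:
# the gap of width 1 exceeds eps, so no edge crosses it.
S = np.arange(len(X)) < 100
print(conductance(W, S))
```

Any cut through the middle of one square, by contrast, severs many epsilon-edges and has strictly positive conductance.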
Resampling: an improvement of Importance Sampling in varying population size models
Sequential importance sampling algorithms have been defined to estimate
likelihoods in models of ancestral population processes. However, these
algorithms are based on features of the models with constant population size,
and become inefficient when the population size varies in time, making
likelihood-based inferences difficult in many demographic situations. In this
work, we modify a previous sequential importance sampling algorithm to improve
the efficiency of the likelihood estimation. Our procedure is still based on
features of the model with constant size, but uses a resampling technique with
a new resampling probability distribution depending on the pairwise composite
likelihood. We tested our algorithm, called sequential importance sampling with
resampling (SISR), on simulated data sets under different demographic scenarios. In
most cases, the computational cost was halved for the same accuracy of
inference, and in some cases reduced a hundredfold. This study provides the first
assessment of the impact of such resampling techniques on parameter inference
using sequential importance sampling, and extends the range of situations where
likelihood inferences can easily be performed.
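The general mechanism (sequential importance sampling with a resampling step triggered when the weights degenerate) can be sketched on a toy linear-Gaussian state-space model. The model, the bootstrap proposal, and the ESS threshold below are illustrative stand-ins, not the coalescent model or the composite-likelihood resampling distribution of the paper.

```python
# Sequential importance sampling with adaptive resampling, estimating the
# likelihood of a toy state-space model: x_t = 0.9 x_{t-1} + noise,
# y_t = x_t + noise.
import numpy as np

rng = np.random.default_rng(2)
T, N = 50, 500                       # time steps, particles

x = np.zeros(T)                      # simulate a "true" trajectory
for t in range(1, T):
    x[t] = 0.9 * x[t - 1] + rng.normal(scale=0.5)
y = x + rng.normal(scale=0.5, size=T)

def norm_pdf(v, mean, sd):
    return np.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

particles = np.zeros(N)
W = np.full(N, 1.0 / N)              # normalized importance weights
loglik = 0.0
for t in range(T):
    particles = 0.9 * particles + rng.normal(scale=0.5, size=N)  # propagate
    g = norm_pdf(y[t], particles, 0.5)       # observation density
    loglik += np.log(np.sum(W * g))          # likelihood increment
    W *= g
    W /= W.sum()
    if 1.0 / np.sum(W ** 2) < N / 2:         # effective sample size too low:
        idx = rng.choice(N, size=N, p=W)     # multinomial resampling
        particles = particles[idx]
        W = np.full(N, 1.0 / N)              # reset to uniform weights
print(loglik)
```

Without the resampling branch, a handful of particles would eventually carry almost all the weight and the likelihood estimate would degrade; resampling only when the effective sample size drops limits the extra noise resampling itself introduces.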
Bayesian functional linear regression with sparse step functions
The functional linear regression model is a common tool to determine the
relationship between a scalar outcome and a functional predictor seen as a
function of time. This paper focuses on the Bayesian estimation of the support
of the coefficient function. To this aim we propose a parsimonious and adaptive
decomposition of the coefficient function as a step function, and a model
including a prior distribution that we name Bayesian functional Linear
regression with Sparse Step functions (Bliss). The aim of the method is to
recover the periods of time that most influence the outcome. A Bayes estimator
of the support is built with a specific loss function, as well as two Bayes
estimators of the coefficient function, one smooth and one a step
function. The performance of the proposed
methodology is analysed on various synthetic datasets and is illustrated on a
black Périgord truffle dataset to study the influence of rainfall on the
production.
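The model structure underlying Bliss, a scalar outcome driven by a step-function coefficient acting on a functional predictor, can be simulated as follows. The grid, the interval locations, and the cosine-mixture predictors are illustrative assumptions, not the paper's data or prior.

```python
# Functional linear model with a step-function coefficient:
# y_i = integral of beta(t) X_i(t) dt + noise, beta piecewise constant.
import numpy as np

rng = np.random.default_rng(3)
grid = np.linspace(0.0, 1.0, 101)            # observation times
dt = grid[1] - grid[0]

# Step-function coefficient: 2 on [0.2, 0.4], -1 on [0.6, 0.7], 0 elsewhere.
beta = np.where((grid >= 0.2) & (grid <= 0.4), 2.0, 0.0) \
     + np.where((grid >= 0.6) & (grid <= 0.7), -1.0, 0.0)
support = beta != 0                          # the set the method recovers

# Functional predictors: smooth random curves (cosine mixtures).
n = 50
coefs = rng.normal(size=(n, 4))
basis = np.cos(np.pi * np.outer(np.arange(1, 5), grid))
Xcurves = coefs @ basis

# Scalar outcomes via a Riemann sum on the grid.
y = (Xcurves * beta).sum(axis=1) * dt + rng.normal(scale=0.1, size=n)
print(y[:3])
```

Estimating the support amounts to recovering, from (Xcurves, y), the two intervals where beta is nonzero.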
Approximate Bayesian Computational methods
Also known as likelihood-free methods, approximate Bayesian computational
(ABC) methods have appeared in the past ten years as the most satisfactory
approach to intractable likelihood problems, first in genetics and then in a
broader spectrum of applications. However, these methods suffer to some degree
from calibration difficulties that make them rather volatile in their
implementation, and thus suspect to users of more traditional
Monte Carlo methods. In this survey, we study the various improvements and
extensions made to the original ABC algorithm over recent years.
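The original ABC rejection scheme that these extensions build on can be sketched on a toy problem (inferring a Gaussian mean). The prior, the summary statistic, and the tolerance below are illustrative choices.

```python
# ABC rejection: draw theta from the prior, simulate data under theta,
# accept theta when the simulated summary is close enough to the observed one.
import numpy as np

rng = np.random.default_rng(4)
observed = rng.normal(loc=1.5, scale=1.0, size=100)
s_obs = observed.mean()                      # summary statistic

def abc_rejection(n_draws, eps):
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5, 5)           # draw from the prior
        sim = rng.normal(loc=theta, scale=1.0, size=100)
        if abs(sim.mean() - s_obs) <= eps:   # compare simulated summary
            accepted.append(theta)
    return np.array(accepted)

post = abc_rejection(20000, eps=0.1)
print(post.mean(), len(post))
```

The calibration difficulties the survey discusses are visible even here: the tolerance eps trades off acceptance rate against approximation quality, and the choice of summary statistic is left to the user.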
Clustering by Estimation of Density Level Sets at a Fixed Probability
In density-based clustering methods, the clusters are defined as the connected components of the upper level sets of the underlying density f. In this setting, the practitioner fixes a probability p, and associates with it a threshold t(p) such that the level set {f >= t(p)} has probability p with respect to the distribution induced by f. This paper is devoted to the estimation of the threshold t(p), of the level set {f >= t(p)}, as well as of the number of connected components of this level set. Given a nonparametric density estimate f_n of f based on an i.i.d. n-sample drawn from f, we first propose a computationally simple estimate t_n(p) of t(p), and we establish a concentration inequality for this estimate. Next, we consider the plug-in level set estimate {f_n >= t_n(p)}, and we establish the exact convergence rate of the Lebesgue measure of the symmetric difference between {f >= t(p)} and {f_n >= t_n(p)}. Finally, we propose a computationally simple graph-based estimate of the number of connected components, which is shown to be consistent. Thus, the methodology yields a complete procedure for analyzing the grouping structure of the data, as p varies over (0, 1).
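A simple version of such a threshold estimate picks t(p) so that a fraction p of the sample falls in the estimated level set {f_n >= t(p)}, i.e., a quantile of the estimated density values at the sample points. The KDE and the two-component mixture below are illustrative assumptions, not the paper's exact estimator.

```python
# Estimate the threshold t(p) at a fixed probability p from a density
# estimate evaluated at the sample points.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)
# A bimodal density: at moderate levels, the level set has two components.
X = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(2, 0.5, 500)])

kde = gaussian_kde(X)
dens = kde(X)                                # f_n evaluated at each sample point

def threshold(dens, p):
    """t(p): a fraction p of the sample has estimated density >= t(p)."""
    return np.quantile(dens, 1.0 - p)

t = threshold(dens, p=0.5)
in_level_set = dens >= t                     # plug-in level set membership
print(t, in_level_set.mean())
```

The points flagged by `in_level_set` are exactly the input to a graph-based count of connected components, the last step of the procedure described above.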
Efficient learning in ABC algorithms
Approximate Bayesian Computation has been successfully used in population
genetics to bypass the calculation of the likelihood. These methods provide
accurate estimates of the posterior distribution by comparing the observed
dataset to a sample of datasets simulated from the model. Although
parallelization is easily achieved, computation times for ensuring a suitable
approximation quality of the posterior distribution are still high. To
alleviate the computational burden, we propose an adaptive, sequential
algorithm that runs faster than other ABC algorithms but maintains accuracy of
the approximation. This proposal relies on the sequential Monte Carlo sampler
of Del Moral et al. (2012) but is calibrated to reduce the number of
simulations from the model. The paper concludes with numerical experiments on a
toy example and on a population genetic study of Apis mellifera, where our
algorithm was shown to be faster than traditional ABC schemes.
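The flavor of such a sequential scheme, with a tolerance shrunk adaptively from the current particle population, can be sketched as follows. This skeleton omits the importance weights and the MCMC move acceptance step of the full algorithm, and the toy Gaussian model is an illustrative assumption, not the authors' calibration.

```python
# Simplified sequential ABC: start from the prior, then repeatedly tighten
# the tolerance to a quantile of the current distances, resample the
# survivors, and perturb them.
import numpy as np

rng = np.random.default_rng(6)
obs = rng.normal(loc=1.5, size=100).mean()     # observed summary statistic

def simulate_summary(theta):
    """Simulate a dataset under theta and return its summary."""
    return rng.normal(loc=theta, size=100).mean()

N = 500
theta = rng.uniform(-5, 5, size=N)             # initial sample from the prior
dist = np.array([abs(simulate_summary(th) - obs) for th in theta])
eps = np.inf
for _ in range(5):                             # a fixed number of rounds
    eps = np.quantile(dist, 0.5)               # shrink the tolerance adaptively
    keep = dist <= eps
    # Resample the surviving particles and perturb them (Gaussian moves).
    idx = rng.choice(np.where(keep)[0], size=N)
    theta = theta[idx] + rng.normal(scale=0.2, size=N)
    dist = np.array([abs(simulate_summary(th) - obs) for th in theta])
print(eps, theta.mean())
```

Choosing each tolerance from the particles themselves, rather than from a fixed schedule, is what keeps the number of model simulations low: effort concentrates where the current population already is.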